Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese

Authors

  • Ryoji Hamabe
  • Kiyotaka Uchimoto
  • Tatsuya Kawahara
  • Hitoshi Isahara
Abstract

Japanese dependency structure is usually represented by relationships between phrasal units called bunsetsus. One of the biggest problems with dependency structure analysis in spontaneous speech is that clause boundaries are ambiguous. This paper describes a method for detecting the boundaries of quotations and inserted clauses, and a method for improving dependency accuracy by applying the detected boundaries to dependency structure analysis. The quotations and inserted clauses are identified with an SVM-based text chunking method that considers information on morphemes, pauses, fillers, etc. Information from an automatically analyzed dependency structure is also used to detect the beginning of the clauses. Our evaluation experiment using the Corpus of Spontaneous Japanese (CSJ) showed that the automatically estimated boundaries of quotations and inserted clauses helped to improve the accuracy of dependency structure analysis.
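
As a rough illustration of how detected clause boundaries might feed into dependency analysis, the sketch below turns per-bunsetsu chunk tags into clause spans and then narrows the head candidates of bunsetsus inside a span. It is not the authors' implementation. The B/I/O tag scheme, the function names `bio_to_spans` and `candidate_heads`, and the rule that a non-final bunsetsu inside a clause may only depend on a later bunsetsu in the same clause are all assumptions made for this example. The sketch also assumes that the tags have already been produced by some chunker and that a head always lies to the right of its dependent.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) bunsetsu indices


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert per-bunsetsu B/I/O tags into inclusive clause spans.

    "B" opens a clause, "I" continues it, "O" is outside any clause.
    (Hypothetical tag scheme, assumed for this sketch.)
    """
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # a new clause closes the open one
                spans.append((start, i - 1))
            start = i
        elif tag == "O":
            if start is not None:      # leaving the open clause
                spans.append((start, i - 1))
                start = None
        # "I" keeps the current clause open; a stray "I" is ignored
    if start is not None:              # clause runs to the end of the utterance
        spans.append((start, len(tags) - 1))
    return spans


def candidate_heads(index: int, n: int, spans: List[Span]) -> List[int]:
    """Return possible head positions for the bunsetsu at `index`.

    Assumed constraint: a bunsetsu inside a clause, other than the
    clause-final one, may only depend on a later bunsetsu in the same
    clause. Every other bunsetsu may depend on any later bunsetsu.
    """
    for start, end in spans:
        if start <= index < end:
            return list(range(index + 1, end + 1))
    return list(range(index + 1, n))


tags = ["O", "B", "I", "I", "O", "O"]
spans = bio_to_spans(tags)
for i in range(len(tags)):
    print(i, candidate_heads(i, len(tags), spans))
```

Under these assumptions, the clause-final bunsetsu is the only one allowed to link the clause to the rest of the utterance, which shrinks the search space for a downstream dependency analyzer.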

Similar resources

Dependency structure analysis and sentence boundary detection in spontaneous Japanese

This paper addresses automatic detection of dependencies between Japanese phrasal units called bunsetsus, and sentence boundaries in a spontaneous speech corpus. In spontaneous speech, the biggest problem with dependency structure analysis is that sentence boundaries are ambiguous. In this paper, we propose two methods for improving the accuracy of sentence boundary detection in spontaneous Jap...

Word-level Dependency-structure Annotation to Corpus of Spontaneous Japanese and its Application

In Japanese, the syntactic structure of a sentence is generally represented by the relationship between phrasal units, bunsetsus in Japanese, based on a dependency grammar. In many cases, the syntactic structure of a bunsetsu is not considered in syntactic structure annotation. This paper gives the criteria and definitions of dependency relationships between words in a bunsetsu and their applic...

Analysis of parenthetical clauses in spontaneous Japanese

In this paper, I will discuss the functional aspects of parenthetical clauses and sentences in spontaneous Japanese monologues. Parentheticals can be defined as syntactic elements that are instantly inserted in the middle of an ongoing utterance to add supplemental information and thus interrupt the fluent flow of speech production. Examples of parenthetical clauses/sentences that appeared in ...

A Study of Quranic Quotations in Iqbal’s Poetry: an Intertextual Approach

The present article studies Quranic quotations in the poetry of Mohammad Iqbal Lahori based on the approach of intertextuality. Iqbal is one of the greatest poets and intellectuals of the Eastern Muslim world. The Quran is a source of inspiration in both his life and his poetic career. His poetry is interwoven with the Quran through intertextual quotations. These Quranic quotations are central to the produ...

Application of Clayton Copula in Portfolio Optimization and its Comparison with Markowitz Mean-Variance Analysis

With the aim of portfolio optimization and management, this article utilizes the Clayton copula along with copula theory measures. Portfolio optimization is one of the activities of investment funds. Thus, it is essential to select an appropriate optimization method. In modern financial analyses, there is growing evidence indicating the distribution of proceeds of financial properties is not cu...

Publication date: 2006